Multi-scale ("multiresolution") community detection attempts to identify the most relevant divi-
sions (groups of related nodes) of an arbitrary network over a range of network scales. This task
is generally accomplished by analyzing community stability in an average sense across all commu- 
nities in the network. In some systems, contending partitions of the global community structure 
may be vague or imprecisely defined, but certain local communities may nevertheless be strongly 
correlated at a given network resolution. We demonstrate a general local multiresolution method 
where we draw inferences about local community "strength" based on correlations between clusters 
in independently-solved systems. We propose measures analogous to variation of information and 
normalized mutual information which quantitatively identify the best resolution(s) at the commu- 
nity level. Our approach is independent of the applied community detection algorithm save for 
the inherent requirement that the method be able to identify communities across different network 
scales. It should, in principle, easily adapt to alternate community comparison measures. 

PACS numbers: 89.75.Fb, 64.60.aq, 89.65.-s 



I. INTRODUCTION 

Applications of complex network analysis span a wide 
range of seemingly unrelated fields. In these networks, 
elements of the model system are abstracted as nodes 
(e.g., people, atoms, etc.), and edges represent known
relationships between them (e.g., friendships, energies,
etc.). As depicted in Fig. 1, community detection (CD)
[1, 2] seeks to identify natural groups of related nodes
in a network. This structure can take the form of social 
groups [3], clusters of atoms [4], proteins [5], and much 
more. Several categories of common real-world networks 
are characterized in Ref. [3].

This work extends current methods of "global" mul-
tiresolution CD [5] (see Appendix A) to enable quanti-
tative multiscale evaluation at the local community level
[7-9], effectively "zooming" inward or outward in the net-
work scale depending on the specific node, region, or lo-
cation (e.g., image segmentation applications [10]). Our
local multiresolution replica algorithm (LMRA) quanti- 
tatively identifies the most natural resolution(s) for in- 
dividual communities regardless of the weak or strong 
correlations present in the full network. In essence, the 
LMRA method is able to select optimal values of CD res- 
olution parameter(s) for each cluster in a graph. Here, 
we solve independent copies of the full system, but the 
approach would adapt trivially to other CD algorithms 
which can identify local communities within network sub- 
graphs (i.e., without the need to partition the entire net- 
work) or to other local cluster comparison measures. 

One of the most popular methods of CD defines a cost 
function that attempts to quantitatively encapsulate the 
essential features for a "good" division of nodes, thus 
evaluating the best community structure in an objective 
fashion. Regardless of the specific form, the task is to 
optimize the function for a particular graph to deter- 
mine the optimal node division(s). Newman and Girvan 



[11] introduced the most common approach by far with 
"modularity." CD methods based on Potts model cost 
functions, or methods that may be cast as such [12, 13],
are also common. 

Reichardt and Bornholdt (RB) wrote a Potts model 
[14] which they specialized into two main cases utilizing 
null models. Null models are auxiliary graphs which are 
selected to evaluate the quality of a candidate partition, 
thus implicitly or explicitly selecting the "correct" scale 
for a graph. These methods were shown to suffer from 
an inherent "resolution limit" [ITJ [T4TU7] , which is not 
resolved by varying the network scale [TBI US], making 
it difficult for them to properly identify communities in 
large graphs. 

More Potts model and related approaches include [B]- 
[H 131 HDH23] , and Refs. [El [23] generalized the RB Potts 
models in [T4[ [24] , respectively. Our previous work [6] [7] 
advanced a "local" Potts model, and local models were 
studied in more detail in [5] . Other local methods include 
03 EUl [H [23 HB], including variants of modularity [23 
[28] . Potts systems in CD can experience disorder from 
thermal effects [29] |30] , extraneous edges (noise) [7] I29T 
152"] , and system size [301 133] • The selected model can also 
exacerbate disorder effects [3U [31] ■ 

Some CD methods implicitly select a single "objec- 
tive" scale for a candidate community division (e.g., Refs. 
[TT1 ) , but certain networks such as hierarchical sys- 
tems inherently have multiple natural scales. Hierar- 
chical clustering is an early multiscale method [35], but 
it forces hierarchical structure on every system without 
evaluating the relevance of the solved partitions. More 
recent hierarchical approaches include [36-41], and Ref.
[12] relates the presence of hierarchical features to a scale-
free network property.

A CD algorithm should be able to determine all rele- 
vant scales of a network, ideally without ad hoc imposi- 
tions on the network structure, and this problem is the 
FIG. 1. (Color online) The figure illustrates a network par-
tition where communities are represented by distinct node
shapes and colors. The graph includes ferromagnetic [solid,
black lines with $w_{ij} > 0$ in Eq. (1)] and antiferromagnetic
interactions (gray, dashed lines with $u_{ij} > 0$), and the line
thickness indicates the relative interaction strength. With
Eq. (1), "neutral" interactions (unconnected or undefined re-
lations) are repulsive in nature since they work like adversarial
relations that break up well-defined communities.



impetus for developing quantitative multiresolution net- 
work analysis. Multiscale capable methods that utilize 
cost functions include [5J QH US H3 HISS]. The RB 
Potts model weighs the contribution of the null model 
[14], allowing the cost function to span different network
scales. Other methods encompass varied forms of analy- 
sis UnHMI to attack the problem. 

Even with tunable CD cost function parameters, the 
question of which resolutions are the most natural scales 
for a network is not necessarily answered. Thus, mul- 
tiresolution methods sought to identify the best scale(s) 
[BJ 1431 150] for a network without imposing, or arbi- 
trarily selecting, a preferred network scale. The most 
common method detects "stable" resolutions in terms 
of network and model resolution parameters [5J HSJ 03J . 
Our multiresolution replica algorithm (MRA) calculated 
information-based correlations [6] among independent
copies of the same system to quantitatively compare the 
partition strength across all relevant network scales. 

To our knowledge, all current multiresolution ap- 
proaches analyze the network robustness in an "average" 
sense across all communities (see Appendices [B] and [C]) 
in a network, but the best local communities will not 
necessarily coincide at the same resolution in general. 
For example, communities in large networks may experi- 
ence a "lost-in-a-crowd" effect which can obscure locally 
well-defined communities and limit the ability of global 
multiresolution methods (see Appendix A) to accurately
isolate their structure. In some models, the effect can be 
exacerbated by heterogeneously-sized community struc- 
ture [3H[5T] depending on the network scale. Conversely, 
a global partition may be strong for most communities, 
but a given cluster may still be weakly defined. 

We combine the benefits of multiresolution analysis 
with the local identification of community structure. 



While each community exists in the context of the sur- 
rounding network, we ideally prefer to identify strong 
communities independent of the global system, allowing 
each community to "stand on its own" in terms of the 
evaluation of community structure. Somewhat related 
efforts include detecting "unbalanced" communities in a 
network partition [Slj and an efficient "seed-expansion" 
method by Havemann et al. [35] which could, in princi- 
ple, be modified for other local cost functions. 

The remainder of the work is organized as follows: we
introduce our community detection Potts model in Sec.
II. Section III A elaborates on concepts of community
definitions, and Sec. III B describes the notion of a par-
tition resolution. We suggest a local, community-based
analogy to the variation of information (abbr., VI) and
normalized mutual information (NMI) measures in Sec.
IV, which we apply in Sec. V for our local multiresolution
algorithm. Section VI illustrates the approach with two
examples, and we conclude in Sec. VII. Appendix A ex-
plains the context of local and global terminology used in
this paper. Appendices B and C elaborate on our com-
munity detection and global multiresolution algorithms,
which form the basis of the local analysis presented in
the current work. Finally, Appendices D and E comment
on the semi-metric property of our cluster measure and al-
ternative approaches to local cluster comparisons in an
information-theoretic analogy.



II. POTTS MODEL HAMILTONIAN

Regardless of the underlying solution method, the ulti-
mate goal of any community detection partitioning algo-
rithm is a Potts-type assignment $i \to \sigma_i$ for each node $i$
into one of $q$ different clusters where $\sigma_i$ may be regarded
as a Potts-type variable. Toward this end, we focus di-
rectly on Potts variables. Some methods extend this no- 
tion to include "overlapping" memberships (e.g., Refs. 
[SJ [25] [25J [S3J) where nodes may be shared between, or 
fractionally assigned to, different communities. In these 
cases, the community assignment becomes a vector quan- 
tity for each node as opposed to a single integer value. 

We identify community partitions by minimizing (see
Appendix B) a general CD Potts model

$$H(\{\sigma\}) = -\frac{1}{2} \sum_{i \neq j} \left[ w_{ij} A_{ij} - \gamma u_{ij} (1 - A_{ij}) \right] \delta(\sigma_i, \sigma_j) \tag{1}$$

which we refer to as an "absolute" Potts model (APM) since
it is not defined relative to a null model. Assuming $N$
nodes, $\{A_{ij}\}$ is the adjacency matrix where $A_{ij} = 1$ if
nodes $i$ and $j$ are connected and is 0 if they are not con-
nected. As mentioned above, the spin variable $\sigma_i$ iden-
tifies the community membership of node $i$ in the range
$1 \le \sigma_i \le q$ where node $i$ is a member of community $k$ if
$\sigma_i = k$. The Kronecker delta $\delta(\sigma_i, \sigma_j) = 1$ if $\sigma_i = \sigma_j$ and
0 when $\sigma_i \neq \sigma_j$. By virtue of the Kronecker delta, inter-
actions are limited to spins in the same community, and
they are ferromagnetic in nature if nodes $i$ and $j$ are con-
nected and antiferromagnetic if they are not connected.

In Eq. (1), $\{w_{ij}\}$ and $\{u_{ij}\}$ are the edge weights for
"cooperative" and "neutral" or "adversarial" relations,
respectively. In unweighted graphs, $w_{ij} = u_{ij} = 1$. Both
adversarial and neutral relations serve to break up com- 
munity structure, so the APM [5J [7J penalizes neutral 
relations much like one would expect for adversarial re- 
lations (as opposed to zero energy contributions as in a 
purely ferromagnetic Potts model O [2Q]). This prop- 
erty avoids a trivial ground state solution (i.e., a com- 
pletely collapsed system) present in the purely ferro- 
magnetic Potts model, providing an alternative "penalty 
function" to how modularity resolved the problem [11].
Ref. [23J generalized a common Potts model variant [T3] 
to include "negative" link weights. A network resolution 
roughly corresponds to the typical community size, but 
it is better characterized by a typical community edge 
density (see Sec. III B). The global resolution parameter
$\gamma$ in Eq. (1) scales the relative effects of the ferromag-
netic $\{w_{ij}\}$ and antiferromagnetic $\{u_{ij}\}$ interactions, ef-
fectively allowing the model to vary the network scale.
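
As a rough illustration of how Eq. (1) scores a candidate partition, the following sketch evaluates the energy of an unweighted, undirected graph ($w_{ij} = u_{ij} = 1$). The function name `apm_energy`, the edge-list input format, and the toy graph are assumptions made for this example only; they are not taken from the original work.

```python
from itertools import combinations

def apm_energy(n_nodes, edges, labels, gamma=1.0):
    """Energy of Eq. (1) for an unweighted, undirected graph (w_ij = u_ij = 1).

    Hypothetical helper: `edges` is a list of node pairs and `labels[i]`
    is the community label (Potts spin) of node i.
    """
    edge_set = {frozenset(e) for e in edges}
    energy = 0.0
    # Summing over unordered pairs i < j absorbs the factor 1/2 in Eq. (1).
    for i, j in combinations(range(n_nodes), 2):
        if labels[i] != labels[j]:
            continue  # Kronecker delta: only same-community pairs contribute
        if frozenset((i, j)) in edge_set:
            energy -= 1.0    # connected pair lowers the energy
        else:
            energy += gamma  # missing edge is penalized with weight gamma
    return energy

# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = [0, 0, 0, 1, 1, 1]   # each triangle is its own community
merged = [0] * 6             # everything in one community

print(apm_energy(6, edges, split))   # -6.0
print(apm_energy(6, edges, merged))  # 1.0
```

At $\gamma = 1$ the split partition has the lower energy, while lowering $\gamma$ reduces the penalty for missing edges and makes the merged community less costly, which is the sense in which $\gamma$ varies the network scale.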

Despite the global energy sum, the model is a local 
measure of community structure (see Appendix [A]) be- 
cause all node assignments are made strictly by evaluat- 
ing local network parameters [7J [5] . For simplicity, our 
current analysis will focus on undirected, static networks; 
but both Eq. (1) and the LMRA method in this work
are suitable for general weighted, directed, and dynamic 
(time-dependent) networks. 



comprehensive network should not intuitively disturb the 
natural communities provided they are strongly defined 
relative to any structure in the expanded system. 

Ref. [54] proposed definitions for "strong" and "weak" 
communities: in a strong community, all nodes have 
more internal than external edges, and a weak commu-
nity is one where the sum over all internal edges
exceeds the sum of the external edges. A large social 
network, such as that mentioned above, may not have 
"strong" or even "weak" communities in the sense of the 
proposed definitions, but the communities are still well- 
defined empirically. Thus, these community definitions 
[54] neglect certain important (high noise) and intuitive
[171 152] cases. 

Further, several CD methods compared by Lanci-
chinetti and Fortunato [58] demonstrated that even weak
communities as defined are not restrictive or character- 
istic of the capabilities of some CD algorithms. That is, 
the best methods easily solved the benchmark graphs [5TJ] 
into regions where all nodes (on average) have more ex- 
ternal than internal edges. With these examples in mind, 
it seems appropriate, at least in social and related net- 
works, to favor cost functions or analysis methods that 
utilize pairwise community comparisons when evaluating 
node membership robustness. This assumption inher- 
ently affects the notion of well-defined partitions, com- 
munities, and individual node memberships [71l56j. With 
this in mind, it may be fruitful to pursue a community 
definition based on edge density as opposed to inner and 
outer community edge counts, but a quantitative analysis 
is beyond the scope of the current work. 



III. COMMUNITY DETECTION CONCEPTS 

A precise definition of community structure in net- 
works is still not agreed upon in the literature. Gen- 
erally speaking, communities consist of nodes which are 
strongly connected internally, in terms of the number or 
weight of edges, but those between communities are more 
sparsely connected. There is a question as to whether 
the "inner" versus "outer" degree comparison is summed 
across all external communities [HI |55] or is evaluated 
between individual pairs of communities [5J [7J 156) . 

A. Community definitions 

Communities in social networks are the prototypical 
CD model. People often have many more "external" re- 
lationships of varying strengths than they do within their 
local group where they are a "member." For example, an 
individual may associate with a chess club, but their net-
work of friendships may extend to dozens or even hun-
dreds of people beyond their local group. In many net-
work approximations (e.g., the ubiquitous Zachary karate
club network [57] ), these "extra" edges are omitted as 
extraneous in a reduced-size network, but the additional 
"noise" induced by including these relations in a more 



B. Resolution 

Intuitively, the resolution of a community partition is 
the typical strength of intracommunity connections. This 
concept can be quantified by the typical edge density p 
of the communities in the partition. Communities with 
significantly different edge densities are qualitatively dif- 
ferent. For example, social networks naturally display 
communities of "close friends" or "acquaintances." Close 
friends are generally very likely to know most or all mem- 
bers of the same group (p is high) where acquaintances 
are much less likely to know each other (p is lower). 

As a specific example, a community where each per- 
son has five friendships in a group of six is a "perfect 
clique." That is, every node is connected to all others in 
the group. However, if we consider the same five friend- 
ships in a group of 100, it may not even qualify as a 
community of social acquaintances. These two clusters 
have an identical edge count, but they represent drasti- 
cally different types of communities (i.e., different net- 
work scales). As mentioned above, the inner and outer 
edge count is not sufficient to quantitatively describe a 
cluster. This distinction highlights the importance of a 
penalty term in various CD quality functions. 

In practice, a partition will contain communities with
a range of edge densities, but intuitively, the differences
should not be drastic at a given resolution since the par-
tition should manifest communities with similar "levels
of association." Continuing with the social network ex-
ample, mixing communities of close friends and acquain-
tances in the same partition makes less sense than a par-
tition that indicates close friendships in most communi-
ties. Given this argument, it is reasonable that a given $\gamma$
in Eq. (1) could be applied to the whole graph and pro-
vide meaningful partition information in general, but this
manuscript illustrates a method to enhance the analysis
of complex networks by finding locally optimal resolu-
tions at the community level.

We specialize the edge density analysis below to
unweighted graphs for clarity, but Ref. [7] discusses
weighted graphs in the same context. The edge density
of community $a$ is $p_a = \ell_a / \ell_a^{\max}$ where $\ell_a$ is the number
of edges in the community. $\ell_a^{\max} = n_a (n_a - 1)/2$ is the
maximum number of possible edges in community $a$ with
$n_a$ nodes. The global resolution parameter $\gamma$ in Eq. (1)
requires a minimum edge density for each community in
the partition,

$$p_{\min} \ge \frac{\gamma}{\gamma + 1} \tag{2}$$

which we calculate by determining the minimum den-
sity configuration that yields an energy of zero or less.
Without $\gamma$, the model can only solve a particular implicit
resolution for all systems, $p_{\min} \ge 1/2$. Other models im-
plement similar weight parameters Q31 HU [23H251 [43]
which allow the models to solve distinct network scales.
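
The bound in Eq. (2) can be checked directly for an isolated, unweighted community: with $\ell$ edges out of $\ell^{\max}$ possible, Eq. (1) gives an energy of $-[\ell - \gamma(\ell^{\max} - \ell)]$, which is zero or less exactly when $\ell/\ell^{\max} \ge \gamma/(\gamma + 1)$. The sketch below works through that arithmetic; the helper names are hypothetical and chosen only for this illustration.

```python
def edge_density(n_nodes, n_edges):
    """p_a = l_a / l_a^max with l_a^max = n_a (n_a - 1) / 2."""
    max_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / max_edges

def min_density(gamma):
    """Lower bound of Eq. (2): gamma / (gamma + 1)."""
    return gamma / (gamma + 1.0)

def community_energy(n_nodes, n_edges, gamma):
    """Unweighted Eq. (1) energy of a single community taken on its own."""
    max_edges = n_nodes * (n_nodes - 1) / 2
    return -(n_edges - gamma * (max_edges - n_edges))

# A 10-node community with 30 of 45 possible edges sits exactly at the
# gamma = 2 threshold: density 2/3 and zero energy.
print(edge_density(10, 30), min_density(2.0), community_energy(10, 30, 2.0))
# Removing one edge pushes the density below the bound and the energy above zero.
print(community_energy(10, 29, 2.0))
```

Setting $\gamma = 1$ in `min_density` reproduces the implicit resolution of $1/2$ mentioned above.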

While Eq. (2) provides a convenient lower bound on
the minimum community edge density, optimizing Eq.
(1) implements the constraint by enforcing a stronger
requirement. That is, it merges network elements (a node
to a community or two communities) if the edge density
between them exceeds $p_{\min}$. Thus, one is assured that all
sub-elements of a community are connected by at least
$p_{\min}$. This avoids situations where a minimal number
of connecting edges merge internally dense sub-graphs
in order to arbitrarily satisfy the cost function. It also
avoids resolution-limit-type effects by acting locally [7].



A. Partition correlations 

To define VI and NMI, we select a random node from
partition $A$ and note that it has a probability
$P(k) = n_k/N$ of being in community $k$ where $n_k$ is the number
of nodes in the community. The Shannon entropy is

$$H(A) = -\sum_{k=1}^{q_A} \frac{n_k}{N} \log \frac{n_k}{N} \tag{3}$$



where qA is the number of communities in partition A. 
The mutual information I(A, B) between two partitions 
A and B evaluates how much we learn about A if we know 
B. In practice for our application, contending partitions 
(A, B, . . ., X) are defined as independent copies of the 
system. 

We define a "confusion matrix" for partitions $A$ and
$B$ which specifies how many nodes $n_{ab}$ in community $a$
of partition $A$ are also in community $b$ of partition $B$.
Mutual information is

$$I(A, B) = \sum_{a=1}^{q_A} \sum_{b=1}^{q_B} \frac{n_{ab}}{N} \log \left( \frac{n_{ab} N}{n_a n_b} \right) \tag{4}$$



where $n_a$ ($n_b$) is the number of nodes in community $a$ ($b$)
of partition $A$ ($B$). The variation of information $V(A, B)$
metric is then

$$V(A, B) = H(A) + H(B) - 2I(A, B) \tag{5}$$

which measures the information "distance" between par-
titions $A$ and $B$ with a range of $0 \le V(A, B) \le \log N$.
We use base 2 logarithms.

Some analysts prefer a normalized information mea-
sure [61] for partition similarity

$$U(A, B) = \frac{2I(A, B)}{H(A) + H(B)} \tag{6}$$

NMI and VI are closely related, $U(A, B) = 1 -
V(A, B)/[H(A) + H(B)]$. While NMI is a valuable mea-
sure of partition similarity, it is not a formal metric (see
Appendix D) on partitions $A$ and $B$ in part because
$U(A, A) = 1$ not 0.
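
A minimal sketch of Eqs. (3)-(6) follows, assuming each partition is given as a list of community labels indexed by node (a representation chosen for this example, not specified in the text). Base 2 logarithms are used, and empty cells of the confusion matrix are simply never generated, which corresponds to the usual convention that they contribute zero.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(A) of Eq. (3)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def mutual_information(labels_a, labels_b):
    """Mutual information I(A, B) of Eq. (4)."""
    n = len(labels_a)
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)
    confusion = Counter(zip(labels_a, labels_b))  # n_ab for non-empty overlaps
    return sum((n_ab / n) * log2(n_ab * n / (size_a[a] * size_b[b]))
               for (a, b), n_ab in confusion.items())

def variation_of_information(labels_a, labels_b):
    """V(A, B) of Eq. (5)."""
    return (entropy(labels_a) + entropy(labels_b)
            - 2 * mutual_information(labels_a, labels_b))

def nmi(labels_a, labels_b):
    """U(A, B) of Eq. (6); undefined if both partitions are a single community."""
    return (2 * mutual_information(labels_a, labels_b)
            / (entropy(labels_a) + entropy(labels_b)))

same = [0, 0, 1, 1]
crossed = [0, 1, 0, 1]
print(variation_of_information(same, same), nmi(same, same))
print(variation_of_information(same, crossed), nmi(same, crossed))
```

For the four-node example, identical partitions give $V = 0$ and $U = 1$, while the fully crossed pair reaches the upper limit $V = \log_2 4 = 2$ with $U = 0$.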



IV. INFORMATION MEASURES 

Information measures have received broad acceptance 
for comparing candidate CD partitions. Commonly used 
measures include the variation of information [60] and 
normalized mutual information [61]. We leveraged the
measures in Sec. IV A to identify the best global network
scales via a multiresolution replica method [6] (see Ap-
pendices A and C).



B. Local information analogies 

In defining a cluster comparison measure, we wish 
to maintain consistency with the trend in CD towards 
information-theoretic partition evaluations. If we were to 
compare larger (multi-cluster) sub-graphs, a natural ap- 
proach is to cut the subgraph from the whole network and 
compare the reduced-size partition. This breaks down at 
the cluster level because there is no partition-of-unity as- 
sociated with an individual cluster as is used to define 
NMI or VI for CD. 
FIG. 2. (Color online) The figure schematically depicts $r$ inde-
pendent solvers ("replicas") as spheres navigating the energy
landscape of Eq. (1). Stronger agreement among the replicas,
as measured by information correlations in Sec. IV A, indi-
cates a more accurate global solution. In this manuscript, we
demonstrate that local communities may be strongly defined
even if all the communities in the global system are weakly
correlated (see Fig. 3).



Nevertheless, we can envision comparing any pair of 
clusters independent of the global system, but imple- 
menting an arbitrary measure is difficult in this context. 
Therefore, we consider the cluster embedded in the full 
system of N nodes, giving it a context for the resulting 
cluster-level entropy or information content based on the 
associated partition-of-unity probabilities. As will be ev- 
ident below, strictly speaking we need not actually use 
the true size of the network for our cluster comparisons. 
That is, we could use some other $N' \neq N$, but it is con-
ceptually appealing to evaluate a cluster in the context
of the full network. 

From Eq. (3), the entropy contribution of community
$a$ in partition $A$ is

$$H_a(A) = -\frac{n_a}{N} \log \frac{n_a}{N} \tag{7}$$

where $n_a$ is the number of nodes in community $a$.
Similarly, Eq. (4) indicates the mutual information con-
tribution when comparing cluster $a$ in partition $A$ $(a, A)$
to cluster $b$ in partition $B$ $(b, B)$

$$I_{ab}(A, B) = \frac{n_{ab}}{N} \log \left( \frac{n_{ab} N}{n_a n_b} \right) \tag{8}$$



In analogy with Eq. (5), we introduce the cluster varia-
tion of information (CVI) $v(a, b)$

$$v(a, b) = H_a(A) + H_b(B) - 2 I_{ab}(A, B). \tag{9}$$



CVI exhibits appealing "distance-like" properties of a
semi-metric for comparing clusters $(a, A)$ and $(b, B)$ (see
Appendix D for a trivial proof). Summing over all pairs
of clusters $a$ and $b$, VI is related to CVI by

$$V(A, B) = \sum_{a}^{q_A} \sum_{b}^{q_B} v(a, b) - (q_B - 1) H(A) - (q_A - 1) H(B). \tag{10}$$



Appendix E provides additional remarks.

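From Eq. (6), we introduce the natural cluster normal-
ized mutual information (CNMI) analogy

$$u(a, b) = \frac{2 I_{ab}(A, B)}{H_a(A) + H_b(B)} = -\frac{2 n_{ab} \log \left( \frac{n_{ab} N}{n_a n_b} \right)}{n_a \log \left( \frac{n_a}{N} \right) + n_b \log \left( \frac{n_b}{N} \right)} \tag{11}$$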



While CNMI is not a metric [in part because $u(a, a) = 1$
not 0], it has the same intuitive property of cluster sim-
ilarity that makes NMI attractive for partition compar-
isons. Equation (11) is essentially a normalized variant
of CVI, $u(a, b) = 1 - v(a, b)/[H_a + H_b]$. On smaller
networks, CVI provides a clearer picture of transitions 
with its distance-like semi-metric properties, but CNMI 
is more easily evaluated for larger networks because vari- 
ations in CVI become small as N becomes large. 
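
The cluster-level measures in Eqs. (7)-(9) and (11) can be sketched as follows, assuming each cluster is represented as a Python set of node indices and `n_total` is the system size $N$. The function names, the set representation, and the zero-overlap convention ($I_{ab} = 0$ when the clusters share no nodes) are assumptions made for this illustration.

```python
from math import log2

def cluster_entropy(n_a, n_total):
    """H_a of Eq. (7) for a cluster with n_a of the n_total nodes."""
    p = n_a / n_total
    return -p * log2(p)

def cluster_mutual_information(cluster_a, cluster_b, n_total):
    """I_ab of Eq. (8); assumed to be zero when the clusters do not overlap."""
    n_ab = len(cluster_a & cluster_b)
    if n_ab == 0:
        return 0.0
    return (n_ab / n_total) * log2(
        n_ab * n_total / (len(cluster_a) * len(cluster_b)))

def cvi(cluster_a, cluster_b, n_total):
    """Cluster variation of information v(a, b) of Eq. (9)."""
    return (cluster_entropy(len(cluster_a), n_total)
            + cluster_entropy(len(cluster_b), n_total)
            - 2 * cluster_mutual_information(cluster_a, cluster_b, n_total))

def cnmi(cluster_a, cluster_b, n_total):
    """Cluster NMI u(a, b) of Eq. (11); undefined if both clusters span all N nodes."""
    h = (cluster_entropy(len(cluster_a), n_total)
         + cluster_entropy(len(cluster_b), n_total))
    return 2 * cluster_mutual_information(cluster_a, cluster_b, n_total) / h

a = {0, 1, 2, 3}
b = {2, 3, 4, 5}
print(cvi(a, a, 16), cnmi(a, a, 16))  # identical clusters
print(cvi(a, b, 16), cnmi(a, b, 16))  # half-overlapping clusters
```

With $N = 16$, identical clusters give $v = 0$ and $u = 1$, and the half-overlapping pair gives $v = 0.75$ and $u = 0.25$, consistent with $u = 1 - v/[H_a + H_b]$.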



V. LOCAL MULTIRESOLUTION ALGORITHM 

Our local multiresolution algorithm isolates relevant
local multiresolution order (well-defined local communi-
ties). We invoke $v(a, b)$ in Eq. (9) and $u(a, b)$ in Eq. (11)
to compare local clusters $a$ and $b$ across $r$ "replicas" (in-
dependent solutions). Figure 2 depicts the basic MRA
[6] algorithm given in Appendix C. The LMRA method
depicted in Fig. 3 extends the MRA method by incorpo-
rating comparisons between specific clusters.



A. LMRA replica method 

In general, clusters naturally change as the resolution is 
varied, so how do we identify the appropriate target clus- 
ters for comparison? Two natural approaches include: 
compare clusters for "nearby" resolutions as specified by 
a particular $\gamma_i$ in Eq. (1), or compare targeted ("parent")
clusters for specific node(s) of interest across the repli- 
cas. In the latter case, the node may be selected a priori 
based on a particular identity, or it may be randomly se- 
lected. One may also first analyze the global system and 
"work backwards" to identify relevant nodes as members 
of communities with interesting features. 

In the first case, if one deviates too far from $\gamma_i$, the
cluster will change substantially and the evaluation will 
be less useful. That is, at some point, the cluster changes 
enough that it is no longer the "same" community. We 
could quantitatively define this comparison based on the 
relevant CVI values. 

The latter option is used in the current work where
we select a node of interest (e.g., a specific terrorist as
in Sec. VI B), and trace the parent clusters among the
replicas across a range of network scales [i.e., different
$\gamma_i$'s in Eq. (1)]. This option has two advantages: it is
simpler to implement, but more importantly, the studied 
clusters are always well-defined, enabling comparisons 
of community robustness across all relevant resolutions. 






FIG. 3. (Color online) The figure illustrates our local multiresolution algorithm discussed in detail in Sec. V. The graphs
include ferromagnetic ["cooperative" with $w_{ij} > 0$ in Eq. (1)] relations depicted by solid, black lines and antiferromagnetic
("neutral" or "adversarial" with $u_{ij} > 0$) interactions depicted by gray, dashed lines. The line thickness indicates the relative
interaction strength, and we omit intercommunity adversarial and neutral relations for clarity. In step (1), we independently
solve a series of $r$ "replicas" of the community detection problem (although we could, in general, improve the efficiency by
solving only the local communities embedded in the network). Step (2) identifies the target node(s) of interest (solid red circles)
and their corresponding "parent" clusters (blue dashed circles). Depending on the application, we could alternately calculate
the correlations among all pairs of communities and determine whether the individual clusters are strongly or weakly defined.
Step (3) uses Eqs. (9) and (11) to calculate correlations among all pairs of parent clusters in order to determine the community
robustness at the current resolution specified by $\gamma$ in Eq. (1).



That is, at a given $\gamma_i$, we only need to know the cluster
to which node $i$ belongs, regardless of any structural
changes in its network neighborhood as $\gamma$ is varied.
Cluster correlations are quantitatively evaluated at a
given $\gamma_i$, but the average $v(a, b)$ or $u(a, b)$ measures over
the replica pairs can be compared across different $\gamma_i$'s to
evaluate the relative strength of the parent communities.

As depicted in Fig. 3, the LMRA algorithm is:

(0) Initialize the algorithm. Select the number of repli-
cas $r$ and the number of independent optimization trials
$t$ per replica. Select a set of nodes $\{a\}$ to track based on
problem parameters (e.g., a person of interest in a terror
network in Sec. VI B). Identify the set of resolutions $\{\gamma_i\}$
to analyze (often selected to sample all relevant network
scales, see step 4 in Appendix C) by minimizing Eq. (1).
Select a starting $\gamma_0$.

(1) Solve r independent replicas. For the current $\gamma_i$ in
Eq. (1), apply steps (1)-(3) of the global MRA algorithm
in Appendix C.

(2) Identify parent clusters. Identify the parent cluster
$a_{ij}$ corresponding to each target node $a$ in each replica $j$
at the current $\gamma_i$.



(3) Compare clusters. For each parent cluster $a_{ij}$, cal-
culate CVI $v(a, b)$ in Eq. (9) and CNMI $u(a, b)$ in Eq.
(11) with the corresponding parent cluster $a_{ik}$ in replica
$k$. Calculate the average of measure $S_i$ [$v(a, b)$, $u(a, b)$,
etc.] over all replica pairs at $\gamma_i$ by

$$S_i(a, b) = \frac{2}{r(r - 1)} \sum_{k > j} S_{ijk}(a, b) \tag{12}$$

where $i$ refers to a particular resolution parameter index
for $\gamma_i$ in Eq. (1), and $j$ and $k$ refer to replica summations.

(4) Identify the best resolutions. For each parent
cluster $a_{ij}$, find the lowest CVI values $v(a, b)$ or the
highest CNMI values $u(a, b)$ and their corresponding
resolution(s) $\{\gamma_i^{\mathrm{best}}\} \subset \{\gamma_i\}$. These are the best resolu-
tions for each cluster $a_{ij}$.

As with the global MRA approach in Appendix C, we
are interested in extrema or plateaus in the pertinent
measures in Sec. IV. Empirically, $r \sim O(10)$ or less ap-
pears to be sufficient for most problems. We estimate the
cost to be $O(L r^2)$, which is comparable to the base MRA
algorithm cost in Appendix C.
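
Steps (2) and (3) above can be sketched as follows for a single resolution. The sketch assumes the $r$ replicas have already been solved by some community detection routine and are supplied as lists of community labels indexed by node; the solver itself, the function names, and the toy replicas are all assumptions for illustration rather than the implementation used in this work.

```python
from itertools import combinations
from math import log2

def parent_cluster(labels, node):
    """Step (2): the set of nodes sharing the target node's community label."""
    return {i for i, lab in enumerate(labels) if lab == labels[node]}

def cvi_cnmi(a, b, n_total):
    """Return (v(a, b), u(a, b)) following Eqs. (9) and (11).

    Assumes the clusters are not both the entire network (otherwise h = 0).
    """
    h = sum(-(len(c) / n_total) * log2(len(c) / n_total) for c in (a, b))
    n_ab = len(a & b)  # never zero here: both parents contain the target node
    i_ab = (n_ab / n_total) * log2(n_ab * n_total / (len(a) * len(b)))
    return h - 2 * i_ab, 2 * i_ab / h

def lmra_averages(replicas, node):
    """Step (3): average CVI and CNMI over all replica pairs, as in Eq. (12)."""
    n_total = len(replicas[0])
    parents = [parent_cluster(labels, node) for labels in replicas]
    values = [cvi_cnmi(a, b, n_total) for a, b in combinations(parents, 2)]
    mean_v = sum(v for v, _ in values) / len(values)
    mean_u = sum(u for _, u in values) / len(values)
    return mean_v, mean_u

# Three hypothetical replicas of an 8-node system at one resolution.
replicas = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 2, 2],
]
print(lmra_averages(replicas, node=0))  # parent cluster agrees in all replicas
print(lmra_averages(replicas, node=7))  # parent cluster splits in one replica
```

Repeating the call for each $\gamma_i$ and recording the resolution(s) with the lowest average CVI or highest average CNMI would correspond to step (4). In the toy data, node 0 has a perfectly reproduced parent cluster while node 7 does not, even though both belong to the same replicas.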



B. Alternative implementations 

In the current work, we contrast local, community-level 
analysis with global multiresolution correlations. Thus, 
FIG. 4. (Color online) The figure depicts a constructed
$N = 1024$ node four-level hierarchy. Level 1 is the complete
network with two "sides" of supercommunities that are ran-
domly connected at a low edge density between them. Level
2 consists of two roughly equal sized branches ($N_L = 502$
and $N_R = 522$) which we denote by "left" (L, blue or darker
tone) and "right" (R, silver or medium tone) as the picture
indicates. Level 3 is the set of supercommunities, and level
4 is the set of smallest communities strictly contained within
the supercommunities. At levels 3 and 4, elements of the left
branch are connected at higher internal and intercommunity
edge densities than the corresponding right branch elements.
See the text for a more detailed description of the network.
This construction results in a more "blurred" global multires-
olution signature in Fig. 5(a) where level 4L is lost in the
global MRA plot at feature (iv). The corresponding LMRA 
plot for node 951 in Fig. [(Jc) is nevertheless able to clearly 
identify level 4L as a strongly defined resolution. 



in this algorithm, we solve the full system and select the
appropriate parent clusters for the community-level anal-
ysis. Since the only global parameter that we need to
evaluate CVI or CNMI is the system size $N$, a more ef-
ficient approach could take advantage of our local cost
function in Eq. (1) (see also Ref. [25] for a more efficient
method applied to a different fitness function). Specif-
ically, we would solve for the target communities around
a particular node of interest by examining community
membership opportunities strictly for the neighbors of
nodes in or connected to the target node's local neighborhood. The
remainder of the graph partition need not be specified in
detail to apply Eqs. (9) and (11).

A more comprehensive alternative in step (3*) is useful
if there are no a priori nodes of interest to study.
We could compare all pairs of clusters and identify the
best matching cluster $b$ for each cluster $a$ based on the
minimum $v(a, b)$ at the current $\gamma_i$. Then we would average CVI
$v(a, b)$ over all cluster matches for each best cluster pair. In
this scenario, we could further pursue the relative cluster
comparisons among the replicas by evaluating whether
the best clusters match among themselves. That is, we
would determine if the best match $b$ found from partition A also
selects the parent cluster $a$ when matched from partition B,
repeating the process to the desired depth.

With this alternate step (3*), individual community
matches among the r replicas (see Fig. 3) are not necessarily
symmetric. That is, while the CVI in Eq. (9) is symmetric
in (a, A) and (b, B), this does not require that the best
matching clusters in the respective partitions agree.
Consequently, it would provide an additional
measure of community robustness based on the level of
mutual agreement (number of agreed matches compared
to the total possible matches among all replicas).
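
As a rough illustration, the alternate step (3*) could be sketched in Python as follows. The sketch assumes clusters are stored as Python sets of node indices and that CVI takes the form $v(a,b) = H_a + H_b - 2I_{ab}$ with $H_a = -(n_a/N)\log(n_a/N)$ and $I_{ab} = (n_{ab}/N)\log[n_{ab}N/(n_a n_b)]$; the function names `cvi`, `best_match`, and `mutual_agreement` are hypothetical and are not taken from the original work.

```python
import math

def cvi(a, b, n_nodes):
    """Cluster variation of information v(a, b) between two node sets."""
    n_a, n_b, n_ab = len(a), len(b), len(a & b)
    h_a = -(n_a / n_nodes) * math.log(n_a / n_nodes)
    h_b = -(n_b / n_nodes) * math.log(n_b / n_nodes)
    # The mutual-information contribution vanishes for disjoint clusters.
    i_ab = 0.0 if n_ab == 0 else (n_ab / n_nodes) * math.log(n_ab * n_nodes / (n_a * n_b))
    return h_a + h_b - 2.0 * i_ab

def best_match(cluster, partition, n_nodes):
    """Index of the cluster in `partition` with the minimum CVI to `cluster`."""
    return min(range(len(partition)), key=lambda k: cvi(cluster, partition[k], n_nodes))

def mutual_agreement(part_a, part_b, n_nodes):
    """Fraction of clusters in A whose best match in B points back to them."""
    agreed = 0
    for j, a in enumerate(part_a):
        k = best_match(a, part_b, n_nodes)
        if best_match(part_b[k], part_a, n_nodes) == j:
            agreed += 1
    return agreed / len(part_a)

# Two toy partitions of N = 10 nodes that differ only in the placement of node 3.
part_a = [{0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
part_b = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]
agreement = mutual_agreement(part_a, part_b, n_nodes=10)
```

In this toy case every best match is reciprocated, so the agreement fraction is 1; asymmetric matches would lower it, which is the robustness signal described above.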



VI. EXAMPLES 

As discussed in Appendix C, we run the global
MRA algorithm for the network and concurrently apply
the LMRA algorithm in Sec. V to targeted nodes by
tracking the respective parent clusters across a full range
of relevant network scales. Comparing explicit values of
VI and CVI is difficult, so we evaluate relative values
of VI or CVI for a given network. We demonstrate the
LMRA method with a constructed network example and
a small, real terror network.



A. Branched hierarchy 

We construct a branched, strict hierarchy as depicted
in Fig. 4 which we use to test the LMRA method of Sec.
V. Level 1 is the full system of N = 1024 nodes; level 2 is
the two-part branch split (groups of superclusters) with
$N_L = 502$ and $N_R = 522$ nodes for the left (L) and right
(R) sides, respectively; level 3 is the set of superclusters;
level 4 is the set of innermost clusters.

Level 1 was defined by connecting nodes in the left
and right branches (levels 2L and 2R) with an intercommunity
density $p_1 = 0.015$. The approximate intracommunity
edge densities at level 4 were $p_{4L} = 0.9$
and $p_{4R} = 0.6$ assigned randomly with a normal distribution
of $\delta p = 0.02$. We connected nodes between the
respective communities in the intermediate levels 2 and
3 with probabilities $p_{3L} = 0.37$, $p_{3R} = 0.10$, $p_{2L} = 0.16$,
and $p_{2R} = 0.03$. These values were selected in order to
demonstrate a somewhat "blurred" multiresolution signature
in a controlled example where the underlying local
structure is nevertheless strongly defined.
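
A construction of this kind might be sketched as below. This is only an illustration under stated assumptions: the number and sizes of the clusters at each level are not given in this section, so the toy layout (2 branches x 2 superclusters x 2 clusters x 8 nodes) is hypothetical; the small random spread in the level 4 densities is omitted; and the assignment of 0.37 and 0.10 to the left and right level 3 probabilities is our reading of the text. The function name `branched_hierarchy` is also hypothetical.

```python
import numpy as np

def branched_hierarchy(labels, p1, p2, p3, p4, seed=0):
    """Sample a symmetric 0/1 adjacency matrix for a nested block structure.

    labels: one (side, supercluster, cluster) tuple per node.
    p2, p3, p4: dicts mapping the side ('L' or 'R') to an edge probability.
    """
    rng = np.random.default_rng(seed)
    n = len(labels)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            side_i, super_i, clus_i = labels[i]
            side_j, super_j, clus_j = labels[j]
            if side_i != side_j:
                p = p1              # level 1: across the two branches
            elif super_i != super_j:
                p = p2[side_i]      # level 2: same branch, different supercluster
            elif clus_i != clus_j:
                p = p3[side_i]      # level 3: same supercluster, different cluster
            else:
                p = p4[side_i]      # level 4: same innermost cluster
            if rng.random() < p:
                adj[i, j] = adj[j, i] = 1
    return adj

# Hypothetical toy layout with 64 nodes; nodes 0-31 are the left branch.
labels = [(side, s, c) for side in "LR" for s in range(2) for c in range(2) for _ in range(8)]
adj = branched_hierarchy(
    labels,
    p1=0.015,
    p2={"L": 0.16, "R": 0.03},
    p3={"L": 0.37, "R": 0.10},
    p4={"L": 0.9, "R": 0.6},
)
```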

In Fig. 5(a), we show the global MRA algorithm from
Ref. [6] (summarized in Appendix C) applied to the full
N = 1024 node network using r = 20 replicas and
t = 10 optimization trials per replica. A more thorough
discussion follows, but briefly, feature (iv) illustrates
how poorly-correlated communities almost completely
obscure the well-defined level 4L structure. Nevertheless,
the local MRA algorithm in Sec. V can fully
extract this hidden section of the hierarchy.

In Fig. 5, the left axes plot NMI, U, and VI, V, from
Sec. IV A in the top and bottom sub-panels, respectively,
averaged over all replica pairs. On the right axes, we




FIG. 5. (Color online) In panel (a), we apply our global multiresolution algorithm (MRA, see Appendices A and C) to
the N = 1024 node, four-level, "branched" hierarchy depicted in Fig. 4. Panels (b) and (c) show the MRA method applied
separately to the left and right level 2 hierarchy branches, respectively. In the top sub-panels (a-c), we compare replica
partitions using normalized mutual information U (left axes, see Sec. IV A) and mutual information I (right axes). In the
corresponding bottom sub-panels, we plot variation of information V (left axes) and the Shannon entropy H (right axes). We
also plot the average number of communities q (offset right axes) in top and bottom sub-panels. Features (i)-(iii) demonstrate
that the global MRA algorithm can detect network-wide stable partitions [6]. Feature (iv) in panel (a) shows that the level 4
community structure on the left side, known to be present at feature (4L) in panel (b), is almost completely obscured because
the right branch is significantly more random at the same network scale [i.e., value of $\gamma$ in Eq. (1); see also Sec. III B]. In Fig.
6, we compare parent communities using the local multiresolution algorithm in Sec. V where we demonstrate that the method
can accurately extract level 4L for the targeted nodes.
FIG. 6. (Color online) In panels (a-c), we apply our local multiresolution algorithm (LMRA) in Sec. V to targeted nodes of
the N = 1024 node, four-level, "branched" hierarchy depicted in Fig. 4. The top sub-panels compare targeted communities in
the solved replicas (independent solutions) using the "cluster normalized mutual information" u(a, b) (left axes, see Sec. IV B)
and the mutual information contribution $I_{ab}$ (right axes). The corresponding bottom sub-panels plot the "cluster variation of information"
v(a, b) (left axes) and the Shannon entropy contribution $H_a$ (right axes). Both top and bottom sub-panels also plot the average
number of nodes n in the respective parent communities on the offset right axes. The LMRA method is easily able to extract
the relevant levels 3 and 4 for the target nodes as evidenced by regions of low CVI (or high CNMI) even though level 4L of the
hierarchy is almost completely obscured at feature (iv) in the combined global MRA plot in Fig. 5(a).
[Figure 7 panels: (a) Terrorist network at $\gamma = 0.1$; (b) Expanding network around Mohamed Atta]

FIG. 7. (Color online) The figure depicts a small terrorist network collected from publicly available data [62]. Panel (a)
shows the overall network at $\gamma = 0.1$ in Eq. (1) where distinct node shapes indicate separate communities. Panel (b) shows
an "expanding" community around Mohamed Atta where his "local" cluster grows roughly outward in the diagram. Here,
new node categories (shapes and colors) indicate nodes added to the parent cluster (as opposed to new communities) as $\gamma$ is
lowered to particular well-defined resolutions (see text). In this network, our local multiresolution algorithm indicates that
these communities are strongly defined on an individual basis with CVI $v(a,b) = 0$ in Fig. 9(b) even at resolutions where
the overall system structure is more vaguely defined in Fig. 8. This illustrates the main benefit of our local multiresolution
approach.



plot the average mutual information I and the Shannon
entropy H for top and bottom sub-panels, respectively.
The right offset axes in both sub-panels plot the average
number of communities q. Panels (b) and (c) show
the MRA results applied to the separate left and right
branches of the hierarchy, respectively, using the same r
and t as in panel (a).

Features (i)-(iii) in panel (a) illustrate how the global
MRA signature can identify preferred or stable resolutions
by low VI or high NMI correlations (or plateaus
in H, I, and q in this example) averaged between the
independently-solved replica partitions. Specifically, feature
(i) corresponds to the level 2 partition with $q_i = 2$,
and feature (ii) identifies levels 2L and 3R with $q_{ii} = 11$
concurrently because of the respective community edge
densities (see Sec. III B). Similarly, feature (iii) solves
levels 3L and 4R with $q_{iii} = 52$. These particular partitions
consist of combinations of well-resolved sub-graphs
at different levels of the branched hierarchy, but it is the
loss of level 4L in the global MRA plot that is the main
topic of this example.

At feature (iv) in panel (a), the poor correlations show
that the global analysis of the full system misses level
4L. This occurs because the well-defined local clusters
conflict with more random partitions for the right-side
subgraph in Fig. 4. In contrast, panels (b) and (c) show
that the partitions found by the MRA method applied to the
separate left and right branches are perfectly defined with
V = 0 and U = 1 [marked by (2L), (3L), . . ., (4R), respectively]. That is,
the structure clearly exists locally, but the global MRA
method in panel (a) cannot resolve level 4L.

In Fig. 6(a-c), we plot the results of the new LMRA
method from Sec. IV B for the parent clusters of nodes
116, 661, and 951, respectively, as identified within the
full N = 1024 node system. On the left axes, we plot
CNMI u(a, b) in Eq. (11) and CVI v(a, b) in Eq. (9),
respectively, averaged over all community pairs in the respective
replicas. On the right axes, we plot the mutual
information contribution $I_{ab}$ in Eq. (8) and the Shannon
entropy contribution $H_a$ in Eq. (7) averaged over all pairs
of target communities in the replicas or all target communities,
respectively. The offset right axes plot the average
number of nodes n over all targeted communities.

Despite being buried within the full N = 1024 node
system, the parent cluster of node 951 corresponding to
level 4L is clearly present in the LMRA analysis in Fig.
6(b,c). This illustrates how our LMRA algorithm can
resolve relevant local structure even when the global signature
is obscured. In principle, we could further apply
the LMRA algorithm to all clusters in the partitions and
unambiguously identify the entire set of well-defined level
4L communities.
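
The per-node analysis described above might be sketched as follows for a single resolution. This is a minimal illustration, not the original implementation: partitions are assumed to be lists of Python sets, the CNMI normalization $u(a,b) = 2I_{ab}/(H_a + H_b)$ is an assumed analog of NMI, and the names `parent_cluster`, `cluster_terms`, and `lmra_point` are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def parent_cluster(partition, node):
    """Return the cluster (a set of nodes) that contains `node`."""
    for cluster in partition:
        if node in cluster:
            return cluster
    raise ValueError(f"node {node} is not in the partition")

def cluster_terms(a, b, n_nodes):
    """Entropy contributions H_a, H_b and mutual-information contribution I_ab."""
    n_a, n_b, n_ab = len(a), len(b), len(a & b)
    h_a = -(n_a / n_nodes) * math.log(n_a / n_nodes)
    h_b = -(n_b / n_nodes) * math.log(n_b / n_nodes)
    i_ab = 0.0 if n_ab == 0 else (n_ab / n_nodes) * math.log(n_ab * n_nodes / (n_a * n_b))
    return h_a, h_b, i_ab

def lmra_point(replicas, node, n_nodes):
    """Average CVI, CNMI, and parent-cluster size for one node at one resolution."""
    parents = [parent_cluster(p, node) for p in replicas]
    cvi_vals, cnmi_vals = [], []
    for a, b in combinations(parents, 2):
        h_a, h_b, i_ab = cluster_terms(a, b, n_nodes)
        cvi_vals.append(h_a + h_b - 2.0 * i_ab)
        # Assumed NMI-like normalization; two clusters spanning the whole graph are identical.
        cnmi_vals.append(2.0 * i_ab / (h_a + h_b) if h_a + h_b > 0 else 1.0)
    return mean(cvi_vals), mean(cnmi_vals), mean(len(c) for c in parents)

# Three toy replicas of a 10-node system: node 0 is stable, node 5 is not.
replicas = [
    [{0, 1, 2}, {3, 4, 5}, {6, 7, 8, 9}],
    [{0, 1, 2}, {3, 4}, {5, 6, 7, 8, 9}],
    [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}],
]
v0, u0, n0 = lmra_point(replicas, node=0, n_nodes=10)
v5, u5, n5 = lmra_point(replicas, node=5, n_nodes=10)
```

Repeating `lmra_point` over a range of resolutions would give one curve per target node, analogous to the panels in Fig. 6.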



B. Small terrorist network 

Even small networks can exhibit strongly-defined
local clusters among indistinct global resolutions. We
apply the LMRA method to a small terrorist network
constructed from publicly available data [55]. Given that
[Figure 8 panel: (a) Global MRA: 9/11 Terrorists]

FIG. 8. (Color online) We apply our multiresolution algorithm
(see Appendices A and C) to a small terrorist network
[62]. Although the plot shows a "best" resolution at $\gamma \simeq 0.1$
(depicted in Fig. 7) as indicated by $V \simeq 0$, the remainder
of the plot has a largely "blurred" multiresolution signature
(high VI or low NMI). The V = 0 region on the far left is
an essentially trivial partition into nearly disjoint clusters. In
Fig. 9, we show results from the "local" multiresolution algorithm
in Sec. V applied to three selected terrorists where we track
the respective parent clusters over a range of resolutions [i.e.,
values of $\gamma$ in Eq. (1)] and calculate the cluster correlations
using the CVI and CNMI in Sec. IV B.



the highest quality intelligence would be classified, our 
purpose here is to demonstrate the practical application 
of the LMRA on real data as opposed to setting forth a 
rigorous study of the terror network. 

Figure 7(a) depicts the network at $\gamma = 0.1$ in Eq. (1)
corresponding to the minimum VI at feature (i) in Fig.
8 with $V \simeq 0$ (see below). Here, distinct node shapes indicate
separate communities. The community partitions
with V = 0 at the lowest $\gamma$ settings are unimportant disjoint
collapsed clusters. In Fig. 8, the left axes plot U and V (see
Sec. IV A) for top and bottom sub-panels, respectively,



averaged over all replica pairs. On the right axes, we plot
I and H for top and bottom sub-panels, respectively, and
the offset axes in both sub-panels plot the average number
of communities q.

Figure 7(b) shows the expanding network core centered
on Mohamed Atta at several strongly-defined resolutions
in Fig. 9(b) with v(a, b) = 0. In this panel, distinct node
shapes and colors indicate added nodes [as opposed to
new communities in panel (a)], roughly spreading outward,
as $\gamma$ is lowered. Specifically, the fixed resolutions
correspond to $\gamma = 10$ (smallest, innermost cyan circles),
$\gamma = 3$ (yellow square), $\gamma = 0.6$ (green diamonds),
$\gamma = 0.3$ (red triangles), $\gamma = 0.125$ (dark blue circles), and
$\gamma = 0.05$ (largest, pink squares) with a few other small
fluctuations not depicted.

On the left axes in Fig. 9(a-c), we plot CNMI u(a, b)
in Eq. (11) and CVI v(a, b) in Eq. (9), respectively, averaged
over all pairs of parent communities in the respective
replicas. Similarly, the right axes plot the mutual
information contribution $I_{ab}$ in Eq. (8) and the Shannon
entropy contribution $H_a$ in Eq. (7) averaged over all pairs
of parent communities or all parent communities, respectively.
The right offset axes display the average number
of nodes n over the parent communities.

Each panel shows distinct, but different, regions of $\gamma$
where the parent clusters are strongly defined, but the
cluster correlations in the full network in Fig. 8 are more
poorly defined at most resolutions. Hani Hanjour has a
LMRA signature distinct from Mohamed Atta for $\gamma > 1$,
but they match at lower $\gamma$ because they are mutual
members of the same communities.



VII. CONCLUSION 

Multiresolution network analysis extends the basic notions
of community detection to select the best resolution(s)
for a given network over a range of network
scales. Certain networks may present situations where
local clusters experience a lost-in-a-crowd effect. Despite
being strongly defined, the local structure may be "lost"
among a collection of more poorly defined communities
at a given resolution. This may occur due to the sheer
size of a network or because most clusters do not coalesce
in their strongest state(s) at the same scale(s).

We presented an extension of an existing global mul- 
tiresolution method [5] to detect and quantitatively as- 
sess local multiresolution order. We proposed cluster- 
level analogies to variation of information and normalized 
mutual information which evaluate the strength of local 
communities in the context of a pair of network parti- 
tions. We applied these measures to evaluate correlations 
among individual parent communities in multiple inde- 
pendent solutions (replicas), and we demonstrated that 
the proposed local multiresolution algorithm is able to ex- 
tract local structure despite a blurred global multiresolu- 
tion signature. Our approach is independent of the search 
algorithm or community detection model making it suit- 
able for use with any community detection method that 
can identify partitions across different network scales. 
[Figure 9 panels: (a) LMRA: Hani Hanjour; (b) LMRA: Mohamed Atta; (c) LMRA: Zacarias Moussaoui]

FIG. 9. (Color online) In each panel, we apply our local multiresolution algorithm (LMRA, see Sec. V) to a small terrorist
network [62]. We analyze three selected terrorists by tracking the respective parent clusters over a range of resolutions [i.e.,
values of $\gamma$ in Eq. (1)]. We then calculate the cluster correlations using the community comparison measures in Sec. IV B. Note
that the individual nodes possess certain strongly preferred resolutions with v(a, b) = 0 for their parent clusters whereas the
global system in Fig. 8 is less well-defined for most values of $\gamma$.
Appendix A: Local and global terminology

The meaning of the terms "local" and "global" depends 
on the context. For our purposes, global cost functions 
are those that require network-wide (global) parameters
(e.g., number of edges L, number of communities q, over- 
all graph density p, etc.) in the quantitative evaluation 
of community structure [111 |14j . Global multiresolution 
methods are those for which the best partition is simul- 
taneously determined for the entire system, effectively 
"averaging" the partition robustness over all communi- 
ties. This is true regardless of whether the cost function 
is itself local or global in nature. 

Local cost functions [BH5] or algorithms [12] utilize parameters
only in the neighborhood of a community or
node (e.g., size of community a, edges of node i, etc.) 
to evaluate the best community structure. These can be 
subdivided into "weak" and "strong" local cost functions 
[7] where weakly-local cost functions may depend on the 
details of the community structure. Local multiresolu- 
tion methods, such as the current work, seek to identify 
the best communities based on their strength at a given 
resolution. That is, the evaluation of the best resolution 
is not effectively "averaged" over all the communities in 
the graph, and each community may be strongly resolved 
at different network scales (often described in terms of 
distinct model weighting parameters). 



Appendix B: Community detection algorithm 

Our greedy CD algorithm dynamically "moves" nodes
into the community that best lowers the local energy
according to Eq. (1) given the current state of the system
$\{\sigma_i\}$. The process iterates through the nodes until no
further node moves are available. Typically, O(10) iteration
cycles through all N nodes are required except in rare
instances that lie in or near the "hard" (or "glassy")
phase [Ml 130].

The CD steps are: 

(0) Initialize the system. Initialize the connection matrix
$A_{ij}$ and edge weights $w_{ij}$ and $u_{ij}$. Determine the
number of optimization trials t.

(1) Initialize the clusters. The initial partition is usually
a "symmetric" state wherein each node is the lone
member of its own community (i.e., $q_0 = N$).

(2) Optimize the node memberships. Sequentially se- 
lect each node, traverse its neighbor list, and calculate 
the energy change that would result if it were moved into 
each connected cluster (or an empty cluster). Immedi- 
ately move it to the community which best lowers the 
energy (optionally allowing zero energy changes). 

(3) Iterate until convergence. Repeat step (2) until 
a (perhaps local) energy minimum is reached where no 
nodes can move. 

(4) Test for a local energy minimum. Merge any 
connected communities if the combination lowers the 
summed community energies. If any merges occur, re- 
turn to step (2) and attempt additional node-level re- 
finements. 
(5) Repeat for several trials. Repeat steps (1)-(4)
for t independent "trials" and select the lowest energy 
result as the best solution. By a trial, we refer to a 
copy of the network in which the initial system is ran- 
domized in a symmetric state with a different node order. 
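
Steps (1)-(3) and (5) might be sketched as follows for an unweighted graph. Since Eq. (1) is not reproduced in this appendix, the sketch assumes a Potts-type energy in which each edge inside a community contributes -1 and each missing edge inside a community contributes $+\gamma$; the merge test of step (4) is omitted for brevity. The graph is assumed to be a dict of neighbor sets, and all function names are hypothetical.

```python
import random

def node_energy(adj, members, node, gamma):
    """Energy of `node` if it sits in the community `members` (node itself excluded)."""
    others = members - {node}
    k = len(adj[node] & others)                 # edges from node into the community
    return -(k - gamma * (len(others) - k))     # each missing edge costs gamma

def greedy_trial(adj, gamma, rng):
    """Steps (1)-(3): start from singletons and move nodes until none can move."""
    label = {v: v for v in adj}                 # step (1): q0 = N
    members = {v: {v} for v in adj}
    order = list(adj)
    rng.shuffle(order)                          # each trial uses its own node order
    moved = True
    while moved:                                # step (3)
        moved = False
        for v in order:                         # step (2)
            current = label[v]
            best, best_e = current, node_energy(adj, members[current], v, gamma)
            for c in {label[u] for u in adj[v]}:
                e = node_energy(adj, members[c], v, gamma)
                if e < best_e:
                    best, best_e = c, e
            if best_e > 0:                      # an empty cluster (energy 0) is better
                best = next(c for c, m in members.items() if not m)
            if best != current:
                members[current].discard(v)
                members[best].add(v)
                label[v] = best
                moved = True
    return label

def total_energy(adj, label, gamma):
    """Sum of -1 per intra-community edge and +gamma per missing intra-community edge."""
    nodes = list(adj)
    return sum(
        (-1.0 if v in adj[u] else gamma)
        for i, u in enumerate(nodes) for v in nodes[i + 1:]
        if label[u] == label[v]
    )

def greedy_cd(adj, gamma, trials=4, seed=0):
    """Step (5): keep the lowest-energy result over several independent trials."""
    rng = random.Random(seed)
    results = [greedy_trial(adj, gamma, rng) for _ in range(trials)]
    return min(results, key=lambda lab: total_energy(adj, lab, gamma))

# Two 4-cliques joined by the single edge (3, 4).
adj = {v: set() for v in range(8)}
for block in (range(0, 4), range(4, 8)):
    for i in block:
        adj[i] |= set(block) - {i}
adj[3].add(4)
adj[4].add(3)
label = greedy_cd(adj, gamma=1.0)
```

Each accepted move strictly lowers the total energy, so the sweep terminates; for the toy graph the two cliques are recovered as separate communities.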

The optimal q is usually dynamically determined by
the lowest energy state although the algorithm can also
fix q during the dynamics. Empirically, the computational
effort scales as $O(t L^{1.3} \log k)$ where k is the average
node degree and $\log k$ is from a binary search implemented
on large sparse matrix systems. This greedy
variant can accurately scale to at least $O(10^9)$ edges [7].
We can extend it with a stochastic heat bath [29] solver
or a simulated annealing algorithm [13] at the cost of
significantly increased computational effort.



Appendix C: Global multiresolution algorithm 

As depicted in Fig. 2, our multiresolution algorithm
iteratively applies the CD algorithm in Appendix B to
quantitatively evaluate the best community partitions
over a range of network scales. In its basic form, we independently
solve the CD problem for a given graph over
a range of $\gamma$ in Eq. (1) and evaluate the average strength
of the partition correlations. This process quantitatively
estimates the robustness of the best solution(s) by sampling
the complexity of the energy landscape.

Generally speaking, poorer correlations occur when
there are contending partitions of comparable strength
[i.e., the energy difference of the applied cost function is
near zero], the resolution is inside a "glassy" phase (extraneous
intercommunity edges obscure the dynamic process
of locating the best solution), or the graph is more
random in nature. In the case of contending partitions,
local multiresolution methods, such as the one presented
in the current work, may be able to reliably extract the
well-defined communities.

We quantify the partition correlations using information
theoretic (or other appropriate) measures (see Sec.
IV A). If most or all solvers (replicas) agree on the best
solution, then we rate the partition as "strongly" correlated,
but if the partitions have large variations, we say
the solution is "weak." In either case, we select the lowest
energy replica solution to represent the best answer at a
given resolution $\gamma_i$, but one could also construct a "consensus"
partition [321 [63] IH], particularly in the latter
case of weak solutions [55].
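
For reference, the partition-level measures could be computed along the following lines. This sketch uses the standard definitions $V = H(A) + H(B) - 2I(A,B)$ and $U = 2I(A,B)/[H(A) + H(B)]$, which we assume match Sec. IV A; partitions are represented as label lists over the same nodes, and the function name `partition_measures` is hypothetical.

```python
import math
from collections import Counter

def partition_measures(labels_a, labels_b):
    """Return (H_A, H_B, I, NMI, VI) for two label lists over the same nodes."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))    # n_ab for every overlapping pair
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mi = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    # Two single-cluster partitions have zero entropy and are identical.
    nmi = 2.0 * mi / (h_a + h_b) if h_a + h_b > 0 else 1.0
    vi = h_a + h_b - 2.0 * mi
    return h_a, h_b, mi, nmi, vi

# Identical partitions up to relabeling, and two statistically independent ones.
same = partition_measures([0, 0, 1, 1], [5, 5, 7, 7])
independent = partition_measures([0, 0, 1, 1], [0, 1, 0, 1])
```

Averaging the last two entries over all replica pairs at a fixed resolution would give one point of the global signature.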

As a function of the resolution parameter $\gamma$ in Eq.
(1) (or any relevant CD scale parameter for another
model [HJ|43]), the best resolutions may be identified by
peaks or plateaus in NMI [6], minima or plateaus in VI
[6, 33], and/or plateaus in the number of clusters q [33]
or other measures [6, 44]. Plateaus in these measures
(i.e., NMI, VI, H, q, etc.) as a function of $\gamma$ imply
more "stable" features of the network, although caution
must be exercised when interpreting some measures
[6]. Sharper peaks in NMI or narrow troughs in VI
indicate strongly defined but more transient features.
Significant peaks in VI or troughs in NMI generally
indicate transitions between dominant structures. More
generally, we can further extract pertinent details of
the network from other extrema in NMI and VI (e.g.,
Ref. [10] also analyzed peaks in VI to perform image
segmentation using CD concepts).

The MRA algorithm is: 

(0) Initialize the algorithm. Select the number of independent
replicas r. Identify the set of resolutions $\{\gamma_i\}$ to
analyze using Eq. (1) along with a starting $\gamma_0$. It is often
convenient to begin at high $\gamma$ and step downward,
stopping if the system completely collapses.

(1) Initialize the system. For the current $\gamma_i$, initialize
each replica with a unique set of N spin indices (i.e.,
$q_0 = N$ for each replica j).

(2) Solve each replica. Independently solve each replica
according to the CD algorithm in Appendix B.

(3) Compare all replicas. Calculate the Shannon entropy
for every replica and compare all pairs of replicas
using the mutual information I(A, B), normalized mutual
information U(A, B), and variation of information
V(A, B) measures in Sec. IV A.

(4) Iterate to the next resolution. Increment to the
next resolution $\gamma_{i+1}$. A geometric step size $\Delta\gamma = 10^{1/s}$
is often convenient where $s \sim O(10)$ is an integer number
of $\gamma_i$'s per decade of $\gamma$. Repeat steps (1)-(3) until the
system is fully collapsed (if stepping down in $\gamma$) or no
$\gamma_i$'s remain.
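
The outer loop of steps (0)-(4) might be organized as in the sketch below. It is an illustration only: `solve` and `compare` are hypothetical callables standing in for the CD solver of Appendix B and a pairwise measure such as VI, the replicas are distinguished here by an integer seed, and the early stop on a fully collapsed system is omitted.

```python
import math
from itertools import combinations

def gamma_schedule(gamma_start, gamma_stop, steps_per_decade):
    """Geometric sequence stepping downward by a factor 10**(1/s) per step."""
    decades = math.log10(gamma_start / gamma_stop)
    count = int(math.floor(steps_per_decade * decades + 1e-9)) + 1
    return [gamma_start * 10.0 ** (-i / steps_per_decade) for i in range(count)]

def mra_scan(solve, compare, gammas, replicas):
    """For each gamma, solve `replicas` copies and average `compare` over all pairs."""
    signature = []
    for gamma in gammas:
        solutions = [solve(gamma, seed) for seed in range(replicas)]   # steps (1)-(2)
        values = [compare(a, b) for a, b in combinations(solutions, 2)]  # step (3)
        signature.append((gamma, sum(values) / len(values)))
    return signature

gammas = gamma_schedule(10.0, 0.1, steps_per_decade=2)

# Placeholder solver and measure, used only to show the calling convention.
dummy_solve = lambda gamma, seed: [0, 0, 1, 1]
dummy_compare = lambda a, b: 0.0 if a == b else 1.0
signature = mra_scan(dummy_solve, dummy_compare, gammas, replicas=3)
```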

The information correlations in steps (3) and (4) allow
the determination of the best global network scale(s)
[6] (see Appendix A) based upon regions of $\gamma$ with high
NMI or low VI. Plateaus in I and q may also provide supplemental
information regarding partition stability. The
solution cost scales linearly in r with the CD algorithm
in Appendix B, $O(r t L^{1.3} \log k)$. We have solved systems
with $O(10^7)$ edges on a single processor [6] in a few hours.

The algorithm may detect, but does not impose, a
strictly hierarchical community structure. That is, as
shown in Sec. VI A, the MRA algorithm will show
strongly correlated regions at the well-defined hierarchical
levels, but it is also able to analyze non-hierarchical
multiresolution structure. This approach is somewhat
preferable over forcing a hierarchical structure on every
analyzed network [35] since some networks may not naturally
possess this type of organization. Once the preferred
resolutions are identified, the specific hierarchical nature
can be analyzed and evaluated by other means [66, 67].



Appendix D: Semi-metric property of CVI

A semi-metric possesses intuitive "distance-like" prop- 
erties for comparing cluster similarity. The proof that 
CVI is a semi-metric is trivial. A measure S(a, b) on a 
set X with two variables a and b in X is a semi-metric if
and only if it satisfies the following conditions:

• Non-negativity - $S(a, b) \ge 0$ for all a and b.

• Zero only for equality - $S(a, b) = 0$ only if a = b.

• Symmetry - $S(a, b) = S(b, a)$ for all a and b.

S(a, b) is a metric if it additionally satisfies the triangle
inequality $S(a, c) \le S(a, b) + S(b, c)$ for three variables a,
b, and c in X.
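
These conditions can be spot-checked numerically for CVI. The sketch below assumes the form $v(a,b) = H_a + H_b - 2I_{ab}$ with $H_a = -(n_a/N)\log(n_a/N)$ and $I_{ab} = (n_{ab}/N)\log[n_{ab}N/(n_a n_b)]$, draws random clusters from a small node set, asserts the semi-metric conditions, and merely counts any triangle-inequality violations without claiming a result for them. The function names and the sampling scheme are hypothetical.

```python
import math
import random

def cvi(a, b, n_nodes):
    """Cluster variation of information between two node sets."""
    n_a, n_b, n_ab = len(a), len(b), len(a & b)
    h_a = -(n_a / n_nodes) * math.log(n_a / n_nodes)
    h_b = -(n_b / n_nodes) * math.log(n_b / n_nodes)
    i_ab = 0.0 if n_ab == 0 else (n_ab / n_nodes) * math.log(n_ab * n_nodes / (n_a * n_b))
    return h_a + h_b - 2.0 * i_ab

def check_semimetric(n_nodes=12, samples=2000, seed=1, tol=1e-12):
    """Assert the semi-metric conditions on random clusters; count triangle violations."""
    rng = random.Random(seed)
    nodes = range(n_nodes)

    def random_cluster():
        size = rng.randint(1, n_nodes)          # non-empty cluster of random size
        return frozenset(rng.sample(nodes, size))

    violations = 0
    for _ in range(samples):
        a, b, c = random_cluster(), random_cluster(), random_cluster()
        v_ab = cvi(a, b, n_nodes)
        assert v_ab >= -tol                                  # non-negativity
        assert a == b or v_ab > tol                          # zero only for equality
        assert abs(cvi(a, a, n_nodes)) <= tol                # zero for identical clusters
        assert abs(v_ab - cvi(b, a, n_nodes)) <= tol         # symmetry
        if cvi(a, c, n_nodes) > v_ab + cvi(b, c, n_nodes) + tol:
            violations += 1                                  # triangle inequality fails
    return violations

violations = check_semimetric()
```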



Theorem 1. CVI in Eq. (9) is a semi-metric between
two clusters a and b in partitions A and B of size |A| =
|B| = N in the space of possible partitions of the N nodes:
(1) It is non-negative and equal to zero only if a = b. (2)
It is symmetric with respect to clusters (a, A) and (b, B),
v(a, b) = v(b, a).

Proof.

(1) It is non-negative and strictly equal to zero only if
a = b. From the definition of CVI,

\begin{equation}
\begin{aligned}
v(a,b) &= -\frac{n_a}{N}\log\frac{n_a}{N} - \frac{n_b}{N}\log\frac{n_b}{N}
          - 2\,\frac{n_{ab}}{N}\log\frac{n_{ab}N}{n_a n_b} \\
       &= -\frac{n_a - n_{ab}}{N}\log\frac{n_a}{N} - \frac{n_b - n_{ab}}{N}\log\frac{n_b}{N}
          + \frac{n_{ab}}{N}\log\frac{n_a}{n_{ab}} + \frac{n_{ab}}{N}\log\frac{n_b}{n_{ab}}, \\
v(a,b) &\ge 0
\end{aligned}
\tag{D1}
\end{equation}

since $n_a > 0$, $n_b > 0$, $n_{ab} \ge 0$, $n_a \ge n_{ab}$, and $n_b \ge n_{ab}$.
Furthermore, it is zero only when $n_a = n_b = n_{ab}$. That
is, it is zero when a = b.

(2) It is symmetric with clusters (a, A) and (b, B),
v(a, b) = v(b, a). Since $n_{ab}$ is necessarily equal to $n_{ba}$,
$I_{ab}(A, B)$ is symmetric in clusters (a, A) and (b, B).
Symmetry of v(a, b) is then immediately obvious.

Thus, CVI is a semi-metric. □

We have not proved the triangle inequality for CVI, which
would make it a metric, but the triangle inequality appears to be
violated rarely, if at all.



Appendix E: Alternate cluster measures




A tempting alternate measure for CVI might be defined
based on the individual terms of

\begin{equation}
V(A,B) = H(A|B) + H(B|A)
       = \sum_{a,b}\left[\frac{n_{ab}}{N}\log\frac{n_b}{n_{ab}}
       + \frac{n_{ab}}{N}\log\frac{n_a}{n_{ab}}\right].
\tag{E1}
\end{equation}

From this equivalent variant of VI, the natural CVI definition
would be

\begin{equation}
v'_{ab}(A,B) = \frac{n_{ab}}{N}\log\frac{n_a}{n_{ab}}
             + \frac{n_{ab}}{N}\log\frac{n_b}{n_{ab}}.
\tag{E2}
\end{equation}


Unlike CVI in Eq. (9), Eq. (E2) has the nice property
that the individual cluster contributions sum to VI,
$V(A, B) = \sum_{a}^{q_A} \sum_{b}^{q_B} v'_{ab}(A, B)$.

Unfortunately, this particular launching point does not
work for cluster comparisons. While $v'_{aa}(A, A) = 0$ as
desired, it is also the case that $v'_{ab} = 0$ if $n_{ab} = 0$. That
is, it is zero if no overlap exists between a and b, which
violates the notion of a "distance" as well as one of the
requirements for being a (semi)metric. VI is a metric on
partitions A and B because it sums over all a and b in A
and B, respectively.
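
Both properties of the alternate measure are easy to see numerically. The sketch below assumes the form of Eq. (E2) as $v'_{ab} = (n_{ab}/N)[\log(n_a/n_{ab}) + \log(n_b/n_{ab})]$ and compares the summed contributions against VI computed independently from the entropies and mutual information; the toy partitions and function names are hypothetical.

```python
import math

def v_prime(a, b, n_nodes):
    """Alternate cluster measure of the form in Eq. (E2); zero for disjoint clusters."""
    n_ab = len(a & b)
    if n_ab == 0:
        return 0.0
    return (n_ab / n_nodes) * (math.log(len(a) / n_ab) + math.log(len(b) / n_ab))

def variation_of_information(part_a, part_b, n_nodes):
    """Partition-level VI computed as H(A) + H(B) - 2 I(A, B)."""
    def entropy(part):
        return -sum(len(c) / n_nodes * math.log(len(c) / n_nodes) for c in part)
    mi = 0.0
    for a in part_a:
        for b in part_b:
            n_ab = len(a & b)
            if n_ab:
                mi += n_ab / n_nodes * math.log(n_ab * n_nodes / (len(a) * len(b)))
    return entropy(part_a) + entropy(part_b) - 2.0 * mi

part_a = [{0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
part_b = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]
summed = sum(v_prime(a, b, 10) for a in part_a for b in part_b)
vi = variation_of_information(part_a, part_b, 10)
disjoint = v_prime(part_a[0], part_b[2], 10)   # no shared nodes
```

Here `summed` reproduces `vi`, while `disjoint` evaluates to zero even though the two clusters share no nodes, which is the failure described above.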

We could also consider an alternate ad hoc definition
by redefining the CVI entropy terms in Eq. (10) according
to $v''(a, b) = H_a(A)/q_B + H_b(B)/q_A - 2 I_{ab}(A, B)$.
This variant would again yield the desirable property
$V(A, B) = \sum_{a}^{q_A} \sum_{b}^{q_B} v''(a, b)$, but the measure loses the
semi-metric requirements $v''(a, b) \ge 0$ and $v''(a, a) = 0$.